Simple Variable Length N-grams for Probabilistic Automata Learning

Authors

  • Fábio Natanael Kepler
  • Sérgio Luis Sardi Mergen
  • Cléo Zanella Billa
Abstract

This paper describes an approach used in the 2012 Probabilistic Automata Learning Competition. The main goal of the competition was to gain insight into which techniques and approaches work best for learning sequences produced by different kinds of generating automata. This paper proposes using variable-length n-gram models. Experiments on the test sets provided by the competition show that the variable-length approach works better than fixed 3-grams.
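The abstract does not say how the variable-length models are built. As an illustration only, the following minimal Python sketch shows one common way to make an n-gram predictor variable-length: count next-symbol frequencies for every context length up to a maximum, then predict from the longest context that has been seen often enough. The function names `train` and `next_symbol_probs`, the `min_count` threshold, and the backoff rule itself are assumptions made for this sketch, not the authors' method.

```python
from collections import Counter, defaultdict


def train(sequences, max_order):
    """Count next-symbol frequencies for every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, symbol in enumerate(seq):
            # Record the symbol under each context ending just before position i.
            for k in range(min(i, max_order) + 1):
                context = tuple(seq[i - k:i])
                counts[context][symbol] += 1
    return counts


def next_symbol_probs(counts, history, max_order, min_count=2):
    """Use the longest context seen at least min_count times (hypothetical rule)."""
    for k in range(min(len(history), max_order), -1, -1):
        context = tuple(history[len(history) - k:])
        seen = counts.get(context, Counter())
        total = sum(seen.values())
        # Accept this context if it has enough data; the empty context is the last resort.
        if total >= min_count or (k == 0 and total > 0):
            return {s: c / total for s, c in seen.items()}
    return {}


if __name__ == "__main__":
    data = [[0, 1, 2], [0, 1, 2], [0, 1, 3], [1, 3]]
    model = train(data, max_order=2)
    print(next_symbol_probs(model, [0, 1], max_order=2))
    print(next_symbol_probs(model, [3, 1], max_order=2))
```

With a fixed 3-gram model every prediction uses exactly two symbols of history; here the history `[3, 1]` was never observed, so the sketch falls back to the shorter context `(1,)` instead of returning nothing.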

Similar resources

Notes on Learning Probabilistic Automata

Alberto Apostolico. Probabilistic models of various classes of sources are developed in the context of coding and compression as well as in machine learning and classification. In the first domain, the repetitive structures of substrings are regarded as redundancies and sought to be removed. In the second, repeated subpatterns are unveiled as carriers of information and structure. In both context...

A Link Prediction Method Based on Learning Automata in Social Networks

Nowadays, online social networks are considered one of the most important emerging phenomena of human societies. In these networks, link prediction uses existing knowledge of the interactions between network actors to estimate the probability that a new relationship will be created in the future. A wide range of applications can be found for link prediction such as electro...

N-Gram Feature Selection for Authorship Identification

Automatic authorship identification offers a valuable tool for supporting crime investigation and security. It can be seen as a multi-class, single-label text categorization task. Character n-grams are a very successful approach to representing text for stylistic purposes, since they are able to capture nuances at the lexical, syntactic, and structural levels. So far, character n-grams of fixed length...

Using an Evaluator Fixed Structure Learning Automata in Sampling of Social Networks

Social networks are streaming and diverse, and they include a wide range of edges. They evolve continuously over time and are formed by the activities among users (such as tweets, emails, etc.), where each activity adds an edge to the network graph. Despite their popularity, the dynamic nature and large size of most social networks make it difficult or impossible to study the entire networ...

Learnability of Probabilistic Automata via Oracles

Efficient learnability using the state merging algorithm is known for a subclass of probabilistic automata termed μ-distinguishable. In this paper, we prove that state merging algorithms can be extended to efficiently learn a larger class of automata. In particular, we show learnability of a subclass which we call μ2-distinguishable. Using an analog of the Myhill-Nerode theorem for probabilisti...

Publication date: 2012